ABSTRACT

In a hospital, clinical teachers must also care for patients, so there is less time for the teaching of clinical courses, or for 
discussing clinical cases with interns. However, electronic learning (e-learning) can complement clinical skills education 
for interns in a blended-learning process. Students discuss and interact with classmates in an e-learning collaborative 
environment. E-learning can assist clinical training and provides a collaborative environment, but every student has 
individual learning preferences on the e-learning platform. A typical platform, such as a learning management system 
(LMS), does not provide individual learning activities for every student. This paper clusters students into two groups: 
active and inactive. In each group, students’ learning behavior patterns, i.e., the association rules for activities, are 
derived from the transaction data for the LMS. The cluster to which a student belongs defines the online learning 
behaviors, from the activity association rules. The method then provides individual preferred activities. Teachers instruct 
students in accordance with their aptitude, as derived from the learning behavior pattern. The cluster analysis shows that 
students in the active group often view teaching videos after completing feedback. Students in the inactive group often view 
teaching materials after adding posts on a forum. 

1. INTRODUCTION

In a hospital, clinical teachers must devote time to patients, so they have less time to teach clinical courses, or 
to discuss clinical cases with interns (Prideaux et al., 2000, Ramani and Leinster, 2008). However, e-learning 
can be a complement to clinical skill education for interns (Ruiz et al., 2006). Most medical students feel that 
e-learning has a positive impact on the acquisition of clinical skill and knowledge. It is an integrated, blended 
approach (Gormley et al., 2009). Students who use an e-learning platform as a complement to in-classroom 
education obtain higher scores in the final examination (Seluakumaran et al., 2011). Blended learning, which 
is clinical learning combined with e-learning, results in a greater acquisition of knowledge in radiology 
internship (Mahnken et al., 2011). An e-learning course in radiology gives greater knowledge acquisition 
than a standard lecture-based course and is a cost-effective alternative to standard lecture-based teaching 
(Hadley et al., 2010). 

Medical education in radiology includes undergraduate, postgraduate and continuing education. The 
existing e-learning platforms for radiology include on-line neuroradiology education resources (NeuroRAD) 
(Sparacia et al., 2007) and those of the American Association of Physicists in Medicine (AAPM) and the Radiological 
Society of North America (RSNA) (Brambilla et al., 2011). These platforms were developed for physicists 
and radiologists in continuing education, but few platforms have been developed for undergraduate radiology 
internship. Radiology clinical training includes image diagnosis, nuclear medicine, radiation oncology, 
quality assurance and safety examination, which require discussion and interaction with a teacher and 
classmates. In an e-learning collaborative environment, radiology interns can interact with their classmates at 
any time, to construct their own knowledge base (Brambilla et al., 2011). Clinical teachers provide the 
radiological content, such as image banks and special clinical cases (Perriss et al., 2006). Radiology interns in 
different areas discuss their cases with others by posting on a forum. This study establishes an e-learning 
platform to assist clinical skill training. 

E-learning can assist clinical training and provides a collaborative environment, but every student has 
individual learning preferences on the e-learning platform. A typical platform, such as the learning 
management system (LMS), does not provide individual learning activities for every student. In order to 
provide an individual learning environment, data mining technology is applied to educational systems 
(Romero and Ventura, 2007). This paper clusters students into two groups, active and inactive, based on their activity 
preferences. Students in the active group often view the course, complete 
feedback, add posts, update posts and view discussions on the forum. All of the activity association rules for 
the two groups were retrieved, in order to determine students’ learning behavior patterns for each cluster. 
For example, active group students often view teaching videos for the lessons, after viewing the course and 
completing feedback. They also prefer to write messages to other classmates to communicate emotions. 

The remainder of this paper is organized as follows. Section 2 discusses related studies. Section 3 
describes the proposed method. Section 4 presents the experimental results. Section 5 summarizes the 
findings and states the limitations of this study. 


2. RELATED STUDIES 

Two data mining techniques are used: clustering technique and association rules. Some data mining 
applications for e-learning are also described. 

2.1 Clustering Technique 

Clustering techniques, which are usually used to segment markets (Punj and Stewart, 1983), seek to 
maximize the variance between groups and to minimize the variance within groups. A number of clustering 
algorithms have been developed, such as K-means, hierarchical and fuzzy c-means algorithms (Omran et al., 
2007). K-means clustering (MacQueen, 1967) is a widely used similarity grouping method that partitions a 
dataset into k groups. The K-means algorithm assigns instances to clusters, based on the minimum distance 
principle. An instance is assigned to a cluster based on the minimum distance to the center of the cluster, over 
all of the k clusters. 
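
The minimum distance principle described above can be sketched in a few lines. This is only an illustrative sketch, not the implementation used in this study: it assumes Euclidean distance and random initial centers, and the function name `kmeans` and its parameters are hypothetical.

```python
import numpy as np


def kmeans(points, k, n_iter=100, seed=0):
    """Plain K-means sketch: returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    # Assumption: start from k distinct instances chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        # Assignment step: each instance joins the cluster whose center
        # is at minimum (Euclidean) distance, over all k clusters.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each center moves to the mean of its members;
        # an empty cluster keeps its previous center.
        new_centers = np.array([
            points[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

With two well-separated groups of instances and k = 2, the instances of each group should end up sharing a label.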

2.2 Association Rules 

Association rule mining determines the associations between two sets of products in a transaction database. 
Agrawal et al. (1993) formalized the problem of determining association rules that satisfy the minimum 
support and the minimum confidence requirements. For example, if a set of purchase transactions includes a 
set of product items $I$, an association rule is an implication of the form $X \Rightarrow Y$, where $X \subset I$, $Y \subset I$ and 
$X \cap Y = \emptyset$. $X$ is the antecedent (body) and $Y$ is the consequent (head) of the rule. Two measures, support and 
confidence, are used to determine the quality of an association rule. The support of a rule is the percentage of 
transactions that contain both $X$ and $Y$, and the confidence of a rule is the fraction of all transactions that 
contain $X$ that also contain $Y$. 
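
As a small illustration of the two measures, the following sketch computes support and confidence for a rule over a toy transaction list. The function names and the example transactions are hypothetical and are not taken from the LMS data.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, x, y):
    """Fraction of the transactions containing X that also contain Y."""
    support_x = support(transactions, x)
    if support_x == 0:
        return 0.0  # assumption: a rule with an unseen antecedent gets 0
    return support(transactions, set(x) | set(y)) / support_x


# Hypothetical purchase transactions, for illustration only.
TOY_TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
```

For the rule {bread} => {milk} on these four transactions, two transactions contain both items and three contain bread, so the support is 2/4 and the confidence is 2/3.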

2.3 Data Mining Applications for E-learning 

Data mining technologies include on-line analytical processing (OLAP), clustering, association rules and 
classification and visualization (Zaïane, 2002, Talavera and Gaudioso, 2004, Zorrilla et al., 2005, Mostow 
and Beck, 2006, Romero and Ventura, 2007, Romero et al., 2008). For example, Zorrilla et al. (2005) built a 
web log data cube for OLAP operation, to analyze the log to obtain the information that allows teachers to 
evaluate the learning process. Talavera and Gaudioso (2004) clustered users into groups, to determine their 
behavior patterns and evaluations. Mostow and Beck (2006) developed a listening tool that uses visualization 
technology to help children to decode words and comprehend stories. Zaïane (2002) used association rules 
that use a learner’s access history to recommend on-line learning activities, or shortcuts on a course website. 
Romero et al. (2008) used clustering, association rules and classification technologies to discover knowledge 
from a learning content management system. 

3. METHODOLOGY


This section proposes a method to understand learning behavior patterns for interns, which is shown in 
Figure 1. The activity usage count for each student on the LMS was calculated. The missing values were filled 
during data preprocessing. Because some activities are used by few or no students, feature selection 
was applied and these activities were not taken into account. The continuous values for the 
activity usage count were then transformed into discrete preference values, to form a student-activity 
preference matrix. 

The K-means clustering method was then used to cluster students into activity preference groups, based 
on the similarity between students’ activity preferences, which was measured using Pearson’s correlation 
coefficient, as shown in Eq. (1). $\bar{r}_{S_i}$ and $\bar{r}_{S_j}$ denote the average rating scores of all activities used by 
students $S_i$ and $S_j$, respectively. $I$ denotes the combined set of activities. $r_{S_i,A}$ and $r_{S_j,A}$ 
denote the rating scores of students $S_i$ and $S_j$ for activity $A$. 

$$
\mathrm{corr}(S_i, S_j) = \frac{\sum_{A \in I} (r_{S_i,A} - \bar{r}_{S_i})(r_{S_j,A} - \bar{r}_{S_j})}{\sqrt{\sum_{A \in I} (r_{S_i,A} - \bar{r}_{S_i})^2}\,\sqrt{\sum_{A \in I} (r_{S_j,A} - \bar{r}_{S_j})^2}} \tag{1}
$$
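
The similarity between two students can be sketched as follows. This is a minimal sketch under stated assumptions, not the code used in this study: each student is represented as a dictionary from activity name to score, the sums run over the activities that both students have, and a student whose scores do not vary is treated as uncorrelated with everyone. The function name and the two sample profiles (rows 1 and 3 of the student-activity preference score matrix) are used for illustration only.

```python
import math


def pearson_similarity(scores_i, scores_j):
    """Pearson correlation between two students' activity scores."""
    shared = sorted(set(scores_i) & set(scores_j))
    if not shared:
        return 0.0
    # Average score of all activities used by each student.
    mean_i = sum(scores_i.values()) / len(scores_i)
    mean_j = sum(scores_j.values()) / len(scores_j)
    num = sum((scores_i[a] - mean_i) * (scores_j[a] - mean_j) for a in shared)
    den_i = math.sqrt(sum((scores_i[a] - mean_i) ** 2 for a in shared))
    den_j = math.sqrt(sum((scores_j[a] - mean_j) ** 2 for a in shared))
    if den_i == 0 or den_j == 0:
        return 0.0  # assumption: a flat profile is treated as uncorrelated
    return num / (den_i * den_j)


# Preference scores of students 1 and 3, used here as sample input.
STUDENT_1 = {"view course": -1, "complete feedback": 0, "add post": 1,
             "update post": 0, "view discussion": 1}
STUDENT_3 = {"view course": 1, "complete feedback": 0, "add post": -1,
             "update post": 1, "view discussion": 0}
```

For these two profiles the correlation is negative, which reflects their largely opposite preferences for viewing the course and adding posts.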


In each group, the students’ online learning behavior patterns, i.e., the association rules, were derived from the 
transaction data of the learning management system (LMS). The cluster to which a student belongs defines 
the online learning behavior pattern, based on the association rules in that cluster. The method shows the 
preferred activities for every student. 
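
One way to read this step is as a lookup: once a student's cluster is known, the rules of that cluster give the activities that are likely to follow the student's latest activity. The sketch below is hypothetical. The dictionary layout and the function `suggest_next` are assumptions, and only two rules are included as sample data (adding a post in cluster 0, completing feedback in cluster 1), following the association rules reported in Section 4.3.

```python
# Sample rules per cluster: antecedent activity -> consequent activities.
# Only two rules are listed, for illustration.
CLUSTER_RULES = {
    0: {"forum add post": ["forum view forum", "resource view"]},
    1: {"feedback complete": ["resource view", "url view"]},
}


def suggest_next(cluster_id, last_activity, rules=CLUSTER_RULES):
    """Return the activities associated with last_activity in a cluster."""
    # An unknown cluster or activity yields no suggestion.
    return list(rules.get(cluster_id, {}).get(last_activity, []))
```

For a student in cluster 1 who has just completed feedback, `suggest_next(1, "feedback complete")` returns the resource and URL views, which a teacher could then highlight for that student.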



Figure 1. An overview of the proposed method 

3.1 Data Preprocessing

The student and activity transaction data from the learning management system (LMS) were first preprocessed. 
A linear method was used to fill the missing values. Feature selection was then applied: there 
are many learning activities on the LMS, but some activities are never or seldom 
used and were excluded. Eighteen activities were selected. They include viewing the course, folder, discussion, 
resource and video, completing feedback, adding and updating posts on a forum, writing messages and updating 
user profiles. 

3.2 Activity Preference Scores of Students 

The usage counts for the selected activities are continuous values. Eq. (2) (Lin et al., 2003) is used to 
transform the continuous values to discrete values, i.e., -1, 0, or 1. 

$$
Z = \frac{X - \bar{X}}{\sigma_X} \tag{2}
$$

where $X$ is the activity usage count, $\bar{X}$ and $\sigma_X$ are, respectively, the mean value and the standard deviation of 
the activity usage count and $Z$ is a semantic variable. 

All of the continuous usage count values were normalized to discrete preference scores (PS). A threshold of 0.3 was 
selected to cluster students into suitable groups, using Eq. (2) with $Z < -0.3$, $-0.3 \le Z \le 0.3$ and $Z > 0.3$ 
to respectively represent inactive, neutral and active preferences. The preference score (PS) is the degree of 
preference that a student demonstrates for an activity, which is defined in Eq. (3). The preference score is 
1 if the usage count $X > \bar{X} + 0.3\sigma_X$ and -1 if the usage count $X < \bar{X} - 0.3\sigma_X$; 0 
represents a neutral preference. 

$$
\text{Preference Score (PS)} =
\begin{cases}
1, & \text{when } X > \bar{X} + 0.3\sigma_X \\
-1, & \text{when } X < \bar{X} - 0.3\sigma_X \\
0, & \text{otherwise}
\end{cases} \tag{3}
$$
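
The discretization of usage counts into preference scores can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which estimator was used.

```python
import statistics


def preference_scores(usage_counts, threshold=0.3):
    """Map one activity's usage counts (one per student) to -1, 0 or 1."""
    mean = statistics.fmean(usage_counts)
    # Assumption: population standard deviation of the usage counts.
    std = statistics.pstdev(usage_counts)
    scores = []
    for count in usage_counts:
        if count > mean + threshold * std:
            scores.append(1)    # active preference
        elif count < mean - threshold * std:
            scores.append(-1)   # inactive preference
        else:
            scores.append(0)    # neutral preference
    return scores
```

Applying the function to each activity column in turn yields one column of the student-activity preference score matrix. When every student has the same usage count, the standard deviation is zero and all scores are neutral.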

The student-activity preference score matrix is shown in the following Table 1. 


Table 1. Student-activity preference score matrix

Student ID   View courses   Complete feedback   Add post on forum   Update post on forum   View discussion on forum
1            -1             0                   1                   0                      1
2            0              -1                  0                   -1                     1
3            1              0                   -1                  1                      0
4            0              1                   0                   1                      -1

4. EXPERIMENTAL RESULTS 

4.1 Experiment Setup and Dataset 

A dataset obtained from the learning management system (LMS) of a hospital was used for the experiment. 
The hospital is a medical center in northern Taiwan. This study was approved by the hospital ethics 
committee. Clinical teachers added learning resources (course, teaching material videos, or questionnaires) 
and activities (forum, chat room). Students viewed the courses, talked in chat rooms, discussed on forums 
and completed feedback on the LMS. The details are shown in Figure 2. 

The experiment dataset was extracted from the learning management system of the hospital, for the first 
semester of 2013. There are 10,637 items of student activity transaction data on the LMS. The dependent 
variable, support and confidence thresholds for the association rules were set to 0.5, 0.2 and 0.6, respectively. These 
thresholds were set based on our observation of the online learning behaviors of the students. The most 
popular activities are viewing the course, completing feedback, adding and updating posts and viewing 
discussions on a forum. 
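
To illustrate how the support and confidence thresholds act as filters, the sketch below mines rules between single activities and keeps only those with support of at least 0.2 and confidence of at least 0.6. It is a simplified sketch, not the mining procedure used in this study: the function name is hypothetical, rules are limited to one activity on each side, and the sessions are invented sample data rather than LMS transactions.

```python
from itertools import permutations


def mine_pair_rules(sessions, min_support=0.2, min_confidence=0.6):
    """Return (x, y, support, confidence) for each rule x -> y kept."""
    sets = [set(s) for s in sessions]
    n = len(sets)
    items = sorted(set().union(*sets))
    rules = []
    for x, y in permutations(items, 2):
        n_x = sum(1 for s in sets if x in s)
        n_xy = sum(1 for s in sets if x in s and y in s)
        sup = n_xy / n        # share of sessions containing both x and y
        conf = n_xy / n_x     # share of sessions with x that also have y
        if sup >= min_support and conf >= min_confidence:
            rules.append((x, y, sup, conf))
    return rules


# Invented sessions, for illustration only.
TOY_SESSIONS = [
    ["course view", "url view"],
    ["course view", "url view", "message write"],
    ["course view", "resource view"],
    ["forum add post"],
    ["course view", "url view"],
]
```

In this sample, "course view -> url view" is kept (support 3/5, confidence 3/4), whereas "course view -> resource view" is dropped because its confidence of 1/4 is below the threshold.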

Figure 2. Knowledge sharing (left) and a discussion on a forum (right) on the e-learning platform


4.2 Student Cluster Identification 


It is first necessary to identify the characteristics of each student cluster, i.e., the activity preferences for 
each student cluster. The average usage count for activity $X$ in a cluster is compared to the average usage 
count $\bar{X}$ plus/minus 0.3 times the standard deviation, $\sigma_X$, for all students, to obtain the cluster preference score 
(PS), using Eq. (2). The preference score for an activity in a cluster is $PS(X^{\text{cluster ID}}_{\text{activity}})$. For example, 
$PS(X^{0}_{\text{view course}}) = -1$ and $PS(X^{1}_{\text{view course}}) = 1$. The activity preference scores for two clusters are shown 
in Table 2. The preference scores for activities such as viewing the course, completing feedback, adding 
posts, updating posts and viewing discussions on a forum in cluster 0 are $PS(X^{0}_{\text{view course}})$, 
$PS(X^{0}_{\text{complete feedback}})$, $PS(X^{0}_{\text{add post}})$, $PS(X^{0}_{\text{update post}})$ and $PS(X^{0}_{\text{view discussion}})$, which are (-1, -1, -1, -1, -1), and 
the preference scores for these activities in cluster 1 are $PS(X^{1}_{\text{view course}})$, $PS(X^{1}_{\text{complete feedback}})$, 
$PS(X^{1}_{\text{add post}})$, $PS(X^{1}_{\text{update post}})$ and $PS(X^{1}_{\text{view discussion}})$, which are (1, 1, 1, 1, 1). 
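
The comparison can be checked directly against a few rows of Table 2. In the sketch below the function name and the dictionary layout are hypothetical; the numbers are the cluster means and the lower and upper bounds reported in Table 2 for three activities.

```python
def cluster_preference(cluster_mean, lower, upper):
    """Preference score of a cluster for one activity."""
    if cluster_mean > upper:
        return 1
    if cluster_mean < lower:
        return -1
    return 0


# activity: (cluster 0 mean, cluster 1 mean, lower bound, upper bound),
# with values copied from Table 2.
TABLE_2_ROWS = {
    "course view": (269.3, 519.3, 319.8, 478.2),
    "feedback complete": (7.0, 14.7, 7.8, 10.2),
    "folder view": (12.0, 13.3, 7.9, 12.1),
}

for activity, (mean_0, mean_1, lower, upper) in TABLE_2_ROWS.items():
    ps_0 = cluster_preference(mean_0, lower, upper)
    ps_1 = cluster_preference(mean_1, lower, upper)
    print(f"{activity}: cluster 0 -> {ps_0}, cluster 1 -> {ps_1}")
```

For viewing the course, the cluster 0 mean of 269.3 lies below the lower bound of 319.8, so the score is -1, while the cluster 1 mean of 519.3 exceeds the upper bound of 478.2, so the score is 1. For viewing folders, the cluster 0 mean of 12.0 lies within the bounds, so the score is 0.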


Table 2. Activity preference scores for students in the two clusters

Cluster ID   Activity                 Cluster mean   All-students mean   Mean - 0.3 SD   Mean + 0.3 SD   Preference Score (PS)
0            course: recent           3.0            3.0                 2.4             3.6             0
0            course: view             269.3          399.0               319.8           478.2           -1
0            feedback: complete       7.0            9.0                 7.8             10.2            -1
0            folder: view             12.0           10.0                7.9             12.1            0
0            forum: add discussion    2.0            2.0                 2.0             2.0             0
0            forum: add post          38.3           53.0                44.3            61.7            -1
0            forum: delete post       1.0            1.0                 1.0             1.0             0
0            forum: update post       1.0            3.0                 2.4             3.6             -1
0            forum: view discussion   294.3          402.0               332.7           471.3           -1
1            course: recent           2.5            3.0                 2.4             3.6             0
1            course: view             519.3          399.0               319.8           478.2           1
1            feedback: complete       14.7           9.0                 7.8             10.2            1
1            folder: view             13.3           10.0                7.9             12.1            1
1            forum: add discussion    3.0            2.0                 2.0             2.0             1
1            forum: add post          78.5           53.0                44.3            61.7            1
1            forum: delete post       1.5            1.0                 1.0             1.0             1
1            forum: update post       5.5            3.0                 2.4             3.6             1
1            forum: view discussion   531.3          402.0               332.7           471.3           1

The usage counts for these activities for cluster 1 are twice those for cluster 0. Cluster 0 is an inactive 
group and cluster 1 is an active group. Students in the inactive group (cluster 0) seldom view the course, 
complete feedback, add posts, update posts, or view discussions on a forum. However, the active group 
(cluster 1) students have the opposite learning behaviors. Table 2 shows that the usage counts for adding 
posts, updating posts and adding discussions for cluster 1 are all larger than those for cluster 0. Students in 
cluster 1 show confidence in self-expression. The usage counts for recent course, viewing folders and 
deleting posts for cluster 0 are all close to those for cluster 1. “Recent course” means that students check the 
latest course status on the LMS. “Viewing folders” means that students download the necessary files, e.g., 
teaching plans and material, which are placed in folders on the LMS. “Deleting posts” means that students 
delete their post after adding a post on a forum. In summary, students in both groups download the necessary 
teaching material files, check the latest course information and delete their posts on a forum. 

4.3 Activity Association Rules for Student Clusters 

Table 3 shows the association rules between activities for cluster 0, i.e., from activity (X) to activity (Y). 
Students fill out the questionnaire, add discussions on a forum and then view discussions and search for the 
clinical questions on the platform. After adding a discussion on the forum, they view resources to find the 
answer to a clinical question. The resources are the teaching material files, which are provided by the clinical 
teacher. It is interesting that students in cluster 0 often remember to log out of the system, and they pay 
more attention to the security of personal information than students in cluster 1. 

Table 3. Association rules for activities for cluster 0

Cluster ID   Activity (X)             ->   Activity (Y)
0            course: recent           ->   user: update
0            course: view             ->   forum: view forum
                                           resource: view
                                           user: login, logout, update, view all
0            feedback: complete       ->   forum: view forum
                                           resource: view
                                           user: login, logout, update, view all
0            folder: view             ->   user: logout, update
0            forum: add discussion    ->   forum: view forum
                                           resource: view
                                           user: login, logout, update, view all
0            forum: add post          ->   forum: view forum
                                           resource: view
                                           user: login, logout, update, view all
0            forum: delete post       ->   user: logout, update
0            forum: view discussion   ->   forum: view forum
                                           resource: view
                                           user: login, logout, update, view all



Table 4 shows the association rules between activities for cluster 1, i.e., from activity (X) to activity (Y). 
Students in cluster 1 often view a uniform resource locator (URL) and a resource after viewing the course and 
completing feedback. The URL is a web link to teaching videos on YouTube. Students in cluster 1 review the 
lesson on the platform after it is taught. In addition, students in cluster 1 keep in touch with teachers or 
classmates by writing messages. Students in cluster 1 also often surf many websites at the same time and are 
more often automatically logged out of the system after 5 minutes of no activity. 

Table 3 and Table 4 show that students in both groups view other classmates’ recent profiles, which 
include information on the last login to the system. Students wish to know whether classmates view their 
recent posts and discussions on a forum, after they add posts and discussions on a forum. Students often 
check classmates’ latest login time to the system. They also often update their personal photos on the LMS. 
These social behaviors are similar to those seen on Facebook. It is found that students care about peer 
status in the collaborative environment. Peer interaction is important during an internship. Students construct 
their own knowledge bases and learn from each other (Wilson and Stacey, 2004). In addition, students often 
observe other students’ learning status and wish to attract their attention. When they add a new topic for 
discussion on a forum, they view the posts that other classmates add or update. At the same time, they 
regularly update their photos and personal information on their own home pages. They attract the attention of 
peers for personal and social reasons. 

Table 4. Association rules for activities for cluster 1

Cluster ID   Activity (X)             ->   Activity (Y)
1            course: recent           ->   user: login, update
1            course: view             ->   message: write
                                           resource: view
                                           url: view
                                           user: login, update, view all
1            feedback: complete       ->   message: write
                                           resource: view
                                           url: view
                                           user: login, update, view all
1            folder: view             ->   user: login, update, view all
1            forum: add discussion    ->   user: login, update, view all
1            forum: add post          ->   user: login, update, view all
1            forum: delete post       ->   user: login, update
1            forum: update post       ->   user: login, update, view all
1            forum: view discussion   ->   user: login, update, view all



5. CONCLUSION 

After cluster analysis, the average usage counts for viewing the course, completing feedback, adding posts, 
updating posts and viewing discussions on a forum for the active group are twice those for the inactive group. 
Therefore, students in the active group often view forums and resources after viewing the course, completing 
feedback, adding posts and viewing discussions on a forum. The student learning behavior patterns for each 
cluster can be derived, based on the association rules between activities for the cluster. When new interns 
arrive in the following year, teachers can derive their clusters and association rules to determine the students’ behavior and 
then give proper guidance and resources, so that students are 
taught in accordance with their aptitude. However, this study has some limitations. Some students were not 
familiar with the e-learning platform. In the future, the e-learning platform will be introduced before the 
internship. 